Reduce main thread dispatch if not needed#269
Open
MaxHeimbrock wants to merge 13 commits intomainfrom
Open
Conversation
Adds an opt-in `rawSafe` flag to `RegisterPendingCallback`. When set, `FFICallback` resolves the pending callback directly on the FFI callback thread instead of marshaling through Unity's main-thread SynchronizationContext. Saves up to one frame of latency on async ops whose completion only mutates volatile YieldInstruction state. `YieldInstruction.IsDone`/`IsError` are converted to volatile-backed properties so the release semantics of the completion's volatile write make any preceding state mutations visible to the main-thread reader once it observes IsDone == true. Opted in: the three generic instruction classes (`FfiInstruction<T>`, `FfiStreamInstruction<T>`, `FfiStreamResultInstruction<T,U>`) which only set IsDone/IsError/Error/ result and do not touch Unity APIs. This covers SetLocalName/Metadata/ Attributes, UnpublishTrack, all stream Read/Write/Close/WriteToFile. Not opted in: bespoke completion handlers (Connect, PublishTrack, GetStats, CaptureAudioFrame, PerformRpc, SendText, SendFile, TextStreamOpen, ByteStreamOpen, PublishDataTrack). Those create C# objects and/or fire user events and need a separate audit before they can move off the main thread. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Short-circuits ByteStreamReaderEvent in FFICallback so chunk-received events land in the incremental read buffer without waiting for the next main-thread frame drain. Mirrors the AudioStreamEvent fast path. The internal subscriber (ByteStreamReader.ReadIncrementalInstruction) forwards into ReadIncrementalInstructionBase, which now serializes mutations of the chunk queue, latest chunk, error, and the IsEos / IsCurrentReadDone flags under a single lock. The flag fields on StreamYieldInstruction are made volatile so the main-thread coroutine's keepWaiting poll observes the FFI thread's writes without acquiring the lock. Only ReadIncremental consumers benefit; ReadAll still goes through FfiStreamResultInstruction (already raw-safe via the previous commit). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirrors the byte-stream fast path. TextStreamReader.ReadIncrementalInstruction shares ReadIncrementalInstructionBase<TContent> with the byte reader, so the lock added in the previous commit already covers text chunk mutations — only the FFICallback short-circuit needs to be added. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three new EditMode tests in DataStreamIncrementalReadTests covering the ReadIncrementalInstructionBase lock added for raw chunk dispatch: - producer/consumer with 5000 chunks, asserts FIFO and no loss - OnEos racing with chunks, asserts consistent post-race state - Reset racing with OnChunk, asserts no chunk lost (200 trials) YieldInstructionThreadingTests is a new file with three tests that guard the volatile semantics of YieldInstruction.IsDone / IsError: - background thread sets IsDone, foreground spin observes it - keepWaiting flips after a background completion - IsError write before IsDone is visible after the volatile read Each test has a Timeout attribute so a regression that breaks the volatile read or the lock fails the runner instead of hanging it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The [Timeout] attribute does not behave reliably in our CI runner. Removed it from all six threading tests and replaced the only unbounded Task.Wait() with a Wait(TimeSpan) + Assert.Fail pair so a genuine deadlock still produces a clean test failure rather than hanging until the outer CI wall-clock kills the runner. The in-test DateTime.UtcNow deadlines on the spin loops already convert hang regressions into descriptive Assert.Fail messages, so the [Timeout] attribute was redundant safety on top of those. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
base.Reset() writes IsCurrentReadDone=false outside the chunk lock because StreamYieldInstruction owns the field but doesn't know about the lock the subclass added. That left a window where a producer's OnChunk could acquire the lock between base.Reset() and Reset()'s re-acquisition, write _latestChunk and set IsCurrentReadDone=true, and have its chunk immediately overwritten by Reset()'s subsequent queue dequeue. The producer's chunk was lost. Empirically the race fired ~4% of the time under stress (150-300 chunks lost out of 5000 across 5 iterations of OnChunk_ConcurrentProducerAndConsumer_AllChunksObservedInOrder). Fix: move base.Reset() inside the lock so the entire IsCurrentReadDone false-then-maybe-true sequence is atomic against OnChunk. After fix: 5/5 iterations × 6 tests = 30/30 passing via Scripts~/run_unity.sh test -m EditMode -n 5. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The earlier raw-safe commit only flipped FfiInstruction<T> and missed
the two sibling generic classes despite the commit message claiming
all three were opted in. They follow the same pattern: only mutate
volatile YieldInstruction state plus an Error / _result reference set
strictly before IsDone, no Unity APIs touched.
Adds rawSafe: true to:
- FfiStreamInstruction<TCallback> — covers
ByteStreamWriter.WriteInstruction / CloseInstruction and
TextStreamWriter.WriteInstruction / CloseInstruction
- FfiStreamResultInstruction<TCallback,TResult> — covers
ByteStreamReader.ReadAllInstruction / WriteToFileInstruction and
TextStreamReader.ReadAllInstruction
Verified: 5/5 iterations × 6 threading tests = 30/30 passing via
Scripts~/run_unity.sh test -m EditMode -n 5.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The existing threading tests verify cross-thread state visibility
(volatile IsDone, lock around chunk mutations) but cannot detect a
regression that silently routed raw-safe completions back through
SynchronizationContext.Post — the volatile/lock work is correct in
either case.
Adds three EditMode tests that prove the FFI-thread fast path is
actually taken:
RawSafeTrueCallback_RunsOnDispatchingBackgroundThread
Registers a rawSafe:true pending callback, dispatches via
TryDispatchRawSafe from a freshly-created Thread, asserts the
completion captured the dispatcher's ManagedThreadId — not the
test main thread's. This is the load-bearing test.
RawSafeFalseCallback_TryDispatchRawSafe_ReturnsFalseAndDoesNotComplete
Negative control: TryDispatchRawSafe returns false for entries
registered with rawSafe:false and does not invoke the completion.
TryDispatchPendingCallback_RunsCompletionSynchronouslyOnCallerThread
Sanity check that the underlying race-proof dispatch is
synchronous on the caller's thread regardless of rawSafe — what
makes the rawSafe path "raw" is that FFICallback (the FFI thread)
is the caller, not the dispatch mechanism itself.
Two FfiClient methods (TryDispatchRawSafe, TryDispatchPendingCallback)
go from private to internal so the tests can invoke them directly.
The race-semantics comments are unchanged.
Verified: 3 tests x 5 iterations = 15/15 passing via
Scripts~/run_unity.sh test -m EditMode -n 5 -f RawSafeDispatch.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The earlier RawSafeDispatchTests called TryDispatchRawSafe directly,
which proves the inner method is synchronous on the caller's thread
but does not prove FFICallback actually invokes it for raw-safe
entries. Commenting out the if-block in FFICallback would not have
made any of those tests fail.
Refactor: extract everything in FFICallback after the byte parse into
RouteFfiEvent(FfiEvent). FFICallback shrinks to disposed-check, parse,
RouteFfiEvent. Production traffic still always lands in FFICallback;
tests can now drive RouteFfiEvent directly from a chosen thread
without P/Invoke pointer juggling.
Two new integration tests:
RouteFfiEvent_RawSafeTrue_CompletesOnDispatchingThread
Calls RouteFfiEvent from a fresh Thread, asserts the rawSafe
completion captured the dispatcher thread's id (not the test
main thread). Verified to fail with a clear message when the
TryDispatchRawSafe call is removed from RouteFfiEvent.
RouteFfiEvent_RawSafeFalse_PostsToSynchronizationContext_AndDrainsCompletion
Swaps FfiClient.Instance._context for a RecordingSynchronizationContext,
dispatches a non-raw entry, asserts the completion was NOT run on
the dispatcher thread, asserts exactly one item was posted to the
SC, then drains the queue from the test thread and asserts the
completion runs on the drainer thread.
Verified: 5 tests x 5 iterations = 25/25 passing via
Scripts~/run_unity.sh test -m EditMode -n 5 -f RawSafeDispatch.
Negative control: with the rawSafe block in RouteFfiEvent commented
out, the new integration test fails with:
"Completion did not run — RouteFfiEvent may be missing the
TryDispatchRawSafe call for rawSafe entries."
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
AudioStreamEvent, ByteStreamReaderEvent, and TextStreamReaderEvent all early-return inside RouteFfiEvent before reaching the SynchronizationContext.Post that drains into DispatchEvent. The switch arms for those three events were therefore unreachable — DispatchEvent is private and only the post lambda calls it. Replaced with a single comment pointing readers at the fast-path returns. AudioStreamEvent's case had been dead since the audio fast-path was added (predates this branch); the other two became dead in cc2ed4f / 967d0b9. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds Logs to RouteFfiEvent's fast-path early-returns alongside the audio and stream-reader events. UnityEngine.Debug.unityLogger is one of the few Unity APIs documented thread-safe — it queues to the console drain internally — so calling Utils.HandleLogBatch from the FFI thread is safe. Skipping the main-thread post matters most during error storms, panics, and LK_VERBOSE noise where Rust may emit log batches faster than Unity drains its post queue. Logs now reach the console without a one-frame delay even if the main thread is busy or wedged. There are no public subscribers for the Logs event, so this has no effect on user-facing surfaces. The unreachable Logs case in DispatchEvent's switch is removed (now covered by the comment that already documents the other fast-path events). Verified: 11 tests x 5 iterations = 55/55 passing via Scripts~/run_unity.sh test -m EditMode -n 5. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The rawSafe flag described a capability ("safe to run on the FFI
thread") but its name was opaque, and TryDispatchRawSafe was a
misnomer — the method doesn't dispatch anything; it runs the
completion inline on the caller's thread. Rename for clarity:
rawSafe: false (default) → dispatchToMainThread: true (default)
rawSafe: true → dispatchToMainThread: false
TryDispatchRawSafe → TrySkipDispatch
PendingCallbackBase.RawSafe → DispatchToMainThread
The default flips polarity but the meaning is the same: by default,
completions go through SynchronizationContext.Post (safe for any
handler that touches Unity APIs); the four generic FfiInstruction*
classes opt out by passing dispatchToMainThread:false because their
onComplete only mutates volatile state.
TrySkipDispatch reads correctly at the call site:
if (Instance.TrySkipDispatch(...)) return;
// fall through to dispatch
Instance._context?.Post(...);
— "try to skip the dispatch; if we can't, dispatch normally."
TryDispatchPendingCallback keeps its name: "dispatch" there means
function-dispatch (route the event to its registered handler), a
different sense of the word and at a different abstraction level.
Test file renamed RawSafeDispatchTests.cs → SkipDispatchTests.cs
(GUID preserved so Unity sees the rename, not delete+add). Test
class and method names updated to match the new vocabulary.
Verified: 11 tests x 5 iterations = 55/55 passing via
Scripts~/run_unity.sh test -m EditMode -n 5.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced Apr 30, 2026
Closed
Contributor
|
@MaxHeimbrock , I like the PR, what do you think of documenting those events' thread model in the README, like adding a event thread model section with the table you have ? |
xianshijing-lk
approved these changes
May 5, 2026
Contributor
xianshijing-lk
left a comment
There was a problem hiding this comment.
lgtm, but lets put those thread model into a doc
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Move latency-sensitive FFI events off the main thread
Summary
Internal threading change. FFI events whose handlers don't touch Unity APIs now run directly on the FFI callback thread instead of being marshaled to Unity's main thread via
SynchronizationContext.Post. Saves up to one frame of latency on async ops, log delivery, and stream chunk arrival, and removes a small but real allocation per event from the main-thread queue.No public API changes. The user-facing threading contract is unchanged:
Room.ParticipantConnected,TrackPublished,VideoStream.TextureReceived, RPC handlers,Disconnect, etc. all still fire on the main thread.Why
FFICallbackpreviously routed every Rust event (exceptAudioStreamEvent) through_context.Postto Unity's main thread. That's the safe default for handlers that touch Unity APIs (Texture2D,GameObject,Transform, …) but it costs one frame of latency for handlers that don't. Three categories of events are silently paying that cost without benefit:IsDoneon aYieldInstruction—SetMetadata,UnpublishTrack, all streamRead/Write/Closeops.UnityEngine.Debug.unityLoggeris documented thread-safe; the post hop adds latency without benefit, especially during error storms orLK_VERBOSEnoise.This PR moves only those off the main thread. Bespoke completion handlers (anything that fires user events or constructs Unity objects) keep main-thread dispatch.
Event dispatch matrix
After this PR:
AudioStreamEventLogsDebug.unityLoggeris thread-safe; logs reach console immediately during panics / errorsByteStreamReaderEventTextStreamReaderEventReadIncrementalInstructionBase)FfiInstruction<T>SetLocalMetadata,SetLocalName,SetLocalAttributes,UnpublishTrack— only flipIsDone/IsErrorFfiStreamInstruction<T>ByteStreamWriter.Write/Close,TextStreamWriter.Write/CloseFfiStreamResultInstruction<T,U>ByteStreamReader.ReadAll/WriteToFile,TextStreamReader.ReadAllRoomEventParticipantConnected,TrackPublished, etc.TrackEventRpcMethodInvocationDisconnectVideoStreamEventDataTrackStreamEventConnect(one-shot)PublishTrack(one-shot)GetStats(one-shot)CaptureAudioFrame(one-shot)PerformRpc(one-shot)SendText/SendFile(one-shot)TextStreamOpen/ByteStreamOpen(one-shot)PublishDataTrack(one-shot)Decision logic at runtime
FFICallbacknow only parses bytes and delegates toRouteFfiEvent. The routing decision lives in one place (Runtime/Scripts/Internal/FFIClient.cs):TrySkipDispatchlooks up the registered pending callback and only handles the event inline if the callback opted out of main-thread dispatch:Three exit points:
DispatchToMainThread == true→ false. Caller falls through to_context.Post.DispatchToMainThread == false→ run completion inline, return true.Opt-in flow at registration
The default is conservative — every
RegisterPendingCallbackcallsite implicitly requires main-thread dispatch:The three generic
FfiInstruction*classes opt out, because theironCompleteonly mutates volatileYieldInstructionstate:The ten bespoke registration sites (Connect, PublishTrack, GetStats, …) keep the default. They fire user events, construct C# objects, or otherwise touch Unity-visible state.
To make
YieldInstructionsafe to read across threads,IsDoneandIsError(and the correspondingIsEos/IsCurrentReadDoneonStreamYieldInstruction) are backed byvolatilefields. The release semantics of theIsDonewrite make any preceding state mutations (Error,_result, …) visible to the main-thread reader after it observesIsDone == true.Tests
Eleven new EditMode tests, all bounded with explicit
DateTime.UtcNowdeadlines andThread.Join(TimeSpan)/ManualResetEventSlim.Wait(TimeSpan)so a deadlock or volatile regression fails fast with a descriptiveAssert.Failrather than hanging the runner.Tests/EditMode/YieldInstructionThreadingTests.csvolatileIsDone/IsError;keepWaitingflips after a background completionTests/EditMode/DataStreamIncrementalReadTests.cs(extended)OnEosracing with chunks;Resetracing withOnChunk(200 trials)Tests/EditMode/SkipDispatchTests.csSynchronizationContext.Post; integration viaRouteFfiEventThe
RouteFfiEventintegration tests are the load-bearing ones. They drive the same code pathFFICallbackuses, so a future refactor that drops theTrySkipDispatchshort-circuit fails them with a clear message:Verified by negative control: temporarily commenting out the
TrySkipDispatchcall inRouteFfiEventreproduced exactly that failure on the first run.The non-raw integration test (
RouteFfiEvent_MainThreadCallback_PostsToSynchronizationContext_AndDrainsCompletion) installs aRecordingSynchronizationContextasFfiClient._contextfor the test duration, callsRouteFfiEventfrom a background thread, and asserts: (a) the completion did not run on the dispatcher thread, (b) exactly one item was posted, (c) draining the queue from the test thread runs the completion on the drainer thread.